AITopics | theorem 6

theorem 6

Reward Transfer from Inverse Reinforcement Learning: A Coupled Minimax Approach

Hao, Guang-Yuan, van der Laan, Lars, Bibaut, Aurélien, Kallus, Nathan

arXiv.org Machine Learning, May-28-2026

Expert demonstrations, such as those from car drivers, help navigate environments with unknown rewards, but are often collected in controlled settings, such as closed-course test tracks, while learned control policies must be deployed in new environments, such as city streets. We can imitate experts to perform well in the same source environment where demonstrations are observed, and we may even use inverse reinforcement learning (IRL) to improve on simple behavior cloning (Ng and Russell, 2000; Abbeel and Ng, 2004; Ziebart et al., 2008; Fu et al., 2018; Geng et al., 2020). But the target environment may have a different transition law, discount factor, or soft-control regularization. For this, IRL is crucial: we can learn a reward from demonstrations in the source environment and transfer it to the target environment, learning a policy that optimizes the same reward function in a new setting (Fu et al., 2018; Schlaginhaufen and Kamgarpour, 2024). In this paper, we characterize how well this transfer can be done and which approaches are preferable. In particular, we show the value in a coupled approach that takes the target environment into account even when learning from the source. In ordinary offline control, the Bellman equation uses a known reward, so the main statistical error comes from target transitions.
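The transfer step described in the abstract above can be illustrated with a minimal sketch: a fixed reward is re-optimized under a different transition law. This is not the paper's coupled minimax estimator. It assumes the reward has already been recovered, uses plain entropy-regularized (soft) value iteration, and the two-state environment, the function name `soft_value_iteration`, and all numbers are hypothetical.

```python
import math

def soft_value_iteration(reward, trans, gamma, tau, iters=500):
    """Entropy-regularized (soft) value iteration on a tabular MDP.

    reward[s][a]    : reward for taking action a in state s
    trans[s][a][s2] : probability of moving from s to s2 under a
    gamma, tau      : discount factor and softmax temperature
    Returns the soft-optimal policy as policy[s][a].
    """
    n_s, n_a = len(reward), len(reward[0])
    v = [0.0] * n_s
    for _ in range(iters):
        q = [[reward[s][a]
              + gamma * sum(trans[s][a][s2] * v[s2] for s2 in range(n_s))
              for a in range(n_a)] for s in range(n_s)]
        # Soft maximum over actions, computed stably by subtracting the max.
        v = []
        for s in range(n_s):
            m = max(q[s])
            v.append(m + tau * math.log(sum(math.exp((x - m) / tau) for x in q[s])))
    policy = []
    for s in range(n_s):
        weights = [math.exp((x - v[s]) / tau) for x in q[s]]
        total = sum(weights)
        policy.append([w / total for w in weights])
    return policy

# Hypothetical toy problem. State 0 is the start, state 1 is an absorbing goal.
# Action 0 = stay (free), action 1 = try to reach the goal (costs 1).
reward = [[0.0, -1.0], [1.0, 1.0]]

def make_trans(p_reach):
    """Transition law in which action 1 reaches the goal with probability p_reach."""
    return [[[1.0, 0.0], [1.0 - p_reach, p_reach]],
            [[0.0, 1.0], [0.0, 1.0]]]

# Same reward, two transition laws: a reliable source and a slippery target.
source_policy = soft_value_iteration(reward, make_trans(0.9), gamma=0.9, tau=0.1)
target_policy = soft_value_iteration(reward, make_trans(0.05), gamma=0.9, tau=0.1)

print("P(try | start), source:", round(source_policy[0][1], 4))
print("P(try | start), target:", round(target_policy[0][1], 4))
```

In this toy setup the same reward yields opposite behavior in the two environments, which is why cloning the source policy would not be enough. The `gamma` and `tau` arguments can be varied in the same way to mimic a change in discount factor or soft-control regularization.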

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Machine Learning

2605.27834

Genre: Research Report (0.63)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Online Learning-to-Defer with Varying Experts

Duy, Dang Hoang, Montreuil, Yannis, Meyer, Maxime, Carlier, Axel, Ng, Lai Xing, Ooi, Wei Tsang

arXiv.org Machine Learning, May-21-2026

Learning-to-Defer (L2D) methods route each query either to a predictive model or to external experts. While existing work studies this problem in batch settings, real-world deployments require handling streaming data, changing expert availability, and shifting expert distribution. We introduce the first online L2D algorithm for multiclass classification with bandit feedback and a dynamically varying pool of experts. Our method achieves regret guarantees of $O((n+n_e)T^{2/3})$ in general and $O((n+n_e)\sqrt{T})$ under a low-noise condition, where $T$ is the time horizon, $n$ is the number of labels, and $n_e$ is the number of distinct experts observed across rounds. The analysis builds on novel $\mathcal{H}$-consistency bounds for the online framework, combined with first-order methods for online convex optimization. Experiments on synthetic and real-world datasets demonstrate that our approach effectively extends standard Learning-to-Defer to settings with varying expert availability and reliability.

artificial intelligence, def, machine learning, (15 more...)

arXiv.org Machine Learning

2605.1234

Country: Asia (0.28)

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.67)
Education > Educational Setting > Online (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.41)

A Theory of Saddle Escape in Deep Nonlinear Networks

Rawal, Divit, DeWeese, Michael R.

arXiv.org Machine Learning, May-11-2026

In deep networks with small initialization, training exhibits long plateaus separated by sharp feature-acquisition transitions. Whereas shallow nonlinear networks and deep linear networks are well studied, extending these analyses to deep nonlinear networks remains challenging. We derive an exact identity for the imbalance of Frobenius norms of layer weight matrices that holds for any smooth activation and any differentiable loss and use this to classify activation functions into four universality classes. On the permutation-symmetric submanifold, the identity combines with an approximate balance law to reduce the full matrix flow to a scalar ODE, giving a critical-depth escape time law $\tau_\star = \Theta(\varepsilon^{-(r-2)})$ governed by the number $r$ of layers at the bottleneck scale rather than the total depth $L$. We find that this same $r-2$ exponent is recovered under He-normal initialization with $r$ bottleneck layers rescaled by $\varepsilon$, where the symmetry manifold is preserved by the flow but not attracting. We find close agreement between our theory and numerical simulations.
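A small numerical sketch can show how an escape time scaling as ε^{-(r-2)} arises from a scalar ODE. The abstract does not state the reduced equation, so the form da/dt = a^{r-1} used here is an assumption chosen for illustration (it is the familiar scalar caricature of a balanced depth-r product), as are the function names. Under that assumption the time to grow from a = ε to a = 1 is (ε^{-(r-2)} - 1)/(r - 2) for r > 2.

```python
import math

def escape_time_closed_form(eps, r):
    """Time for da/dt = a**(r-1) to grow from eps to 1 (assumed model, r > 2)."""
    return (eps ** (-(r - 2)) - 1.0) / (r - 2)

def escape_time_rk4(eps, r, dt=0.01):
    """Same quantity by direct RK4 integration of the assumed scalar ODE."""
    def f(a):
        return a ** (r - 1)
    a, t = eps, 0.0
    while a < 1.0:
        k1 = f(a)
        k2 = f(a + 0.5 * dt * k1)
        k3 = f(a + 0.5 * dt * k2)
        k4 = f(a + dt * k3)
        a += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return t

r = 4
t_closed = escape_time_closed_form(0.1, r)
t_numeric = escape_time_rk4(0.1, r)

# Log-log slope of escape time against eps; it should approach -(r - 2).
slope = (math.log(escape_time_closed_form(1e-3, r) / escape_time_closed_form(1e-2, r))
         / math.log(1e-3 / 1e-2))

print("closed form:", t_closed, "RK4:", round(t_numeric, 2))
print("log-log slope:", round(slope, 4))
```

The exponent depends only on `r` in this toy model, which mirrors the abstract's statement that the number of bottleneck-scale layers, not the total depth, governs the escape time.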

artificial intelligence, machine learning, urlhttp, (18 more...)

arXiv.org Machine Learning

2605.01288

Country: North America > United States > California (0.28)

Genre: Research Report (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)

Convex-Geometric Error Bounds for Positive-Weight Kernel Quadrature

Hayakawa, Satoshi

arXiv.org Machine Learning, May-8-2026

Kernel quadrature (KQ) is a kernel-based approach to numerical integration, closely related to Bayesian quadrature (BQ) and probabilistic integration [38, 39, 10]. For sufficiently regular integrands, KQ can exploit spectral structure in a reproducing kernel Hilbert space (RKHS) that is invisible to plain Monte Carlo and thereby converge faster than the usual O(N^{-1/2}) rate in the number of points [3, 28]. Unconstrained kernel-based rules, however, may produce numerically unstable weights, motivating longstanding interest in positively weighted rules [13, 21, 29, 46]. In this paper, positive weights mean nonnegative weights that sum to one, i.e., simplex or convex-combination weights. Whether positive-weight KQ can systematically improve over Monte Carlo is a subtle question. Kernel herding and related constructions suggested fast rates under favorable assumptions [13], but the conditional-gradient viewpoint of Bach et al. [4] clarified that the strongest such assumptions are not generally available in infinite-dimensional RKHSs. Subsequent herding-type analyses in broad RKHS settings have therefore mostly remained at the Monte-Carlo scale, except under additional structure or modified algorithms such as sparse herding variants [31, 44, 43]. Beyond herding, subsampling-based positive KQ methods such as thinning [16, 15] and recombination [21, 24] have obtained rates beyond Monte Carlo, but a general mechanism for such improvement in the simple i.i.d.
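As a concrete illustration of a positive-weight rule and its worst-case error, the sketch below evaluates the squared RKHS error of a convex-weight quadrature in closed form. The setting is an assumption made for the example, not taken from the paper: the kernel k(x, y) = min(x, y) on [0, 1] with the uniform measure, for which the kernel mean is x - x²/2 and its integral is 1/3. The function name is hypothetical.

```python
import random

def worst_case_error_sq(nodes, weights):
    """Squared worst-case error over the unit ball of the RKHS of
    k(x, y) = min(x, y) on [0, 1], for integration against the uniform measure.

    Equals  w^T K w - 2 w^T z + c  with  z(x) = x - x^2 / 2  and  c = 1/3.
    """
    gram = sum(wi * wj * min(xi, xj)
               for xi, wi in zip(nodes, weights)
               for xj, wj in zip(nodes, weights))
    cross = sum(w * (x - 0.5 * x * x) for x, w in zip(nodes, weights))
    return gram - 2.0 * cross + 1.0 / 3.0

def equal_weights(n):
    """Simplex weights: nonnegative and summing to one."""
    return [1.0 / n] * n

# One node at the midpoint, and midpoint grids with 2 and 10 nodes.
err_one = worst_case_error_sq([0.5], [1.0])
err_two = worst_case_error_sq([0.25, 0.75], equal_weights(2))
grid = [(i + 0.5) / 10 for i in range(10)]
err_grid = worst_case_error_sq(grid, equal_weights(10))

# Equal-weight rule on i.i.d. uniform draws (plain Monte Carlo) for comparison.
rng = random.Random(0)
mc_nodes = [rng.random() for _ in range(10)]
err_mc = worst_case_error_sq(mc_nodes, equal_weights(10))

print("1 node:", err_one, "2 nodes:", err_two)
print("10-node grid:", err_grid, "10 i.i.d. nodes:", err_mc)
```

Both rules use the same simplex weights, so the difference in error comes only from node placement. This is a one-dimensional toy and says nothing about the rates the paper establishes.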

approximation, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

2605.05705

Country: Asia > Japan > Honshū > Kantō (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Supplementary material for "Regret Bounds for Multilabel Classification in Sparse Label Regimes"

Neural Information Processing Systems, May-1-2026, 01:52:54 GMT

This appendix contains all proofs of the results mentioned in the main body of the paper, plus further results which have been omitted there due to space limits. We recall the following lemma, which upper bounds the probability measure of the ball around a point x ∈ X that contains its kth nearest neighbors. The proof immediately follows from the multiplicative Chernoff bound (see, e.g., Lemma 3.2 in [28]). When combined with Assumption 5.1 we obtain the following corollary. Corollary A.2. Suppose that the measure-smoothness assumption (Assumption 5.1) holds with parameters λ, Cλ, k k.

artificial intelligence, machine learning, probability, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Table

Neural Information Processing Systems, Apr-30-2026, 01:35:33 GMT

It also tolerates no prediction errors on the labeled nodes, so the trade-off parameter can be set to infinity. Local and Global Consistency (LGC) [82] relaxes the GRF method by eliminating the restriction of zero empirical risk on labeled nodes and exploits the normalized Laplacian matrix for smoothing instead. Random Walk Smoothing [83] extends LGC to directed graphs by indirectly operating LGC on a modified undirected graph with a new normalized Laplacian matrix L. Tikhonov Smoothing [4] only uses the labeled nodes in the quadratic error term. Hub & Authority Smoothing [84] proposes another random-walk-based strategy on directed graphs that is motivated by the hub and authority web model. Its smoothing matrix is more complex, with two underlying Laplacian matrices L_A and L_H for in-links and out-links.
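The LGC method mentioned above is commonly written as the iteration F ← αSF + (1 - α)Y, where S = D^{-1/2} W D^{-1/2} is the symmetrically normalized adjacency matrix and Y holds the one-hot labels of the labeled nodes. The sketch below runs that iteration on a small undirected graph. The graph, the labels, the value of α, and the function name `lgc_propagate` are all hypothetical choices for illustration.

```python
import math

def lgc_propagate(adj, labels, n_classes, alpha=0.9, iters=200):
    """Label spreading in the LGC style: F <- alpha * S F + (1 - alpha) * Y.

    adj    : symmetric adjacency matrix (list of lists) with no isolated nodes
    labels : dict mapping labeled node index -> class index
    Returns the predicted class for every node.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    s = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0
          for j in range(n)] for i in range(n)]
    # One-hot label matrix; unlabeled rows stay all-zero.
    y = [[1.0 if labels.get(i) == c else 0.0 for c in range(n_classes)]
         for i in range(n)]
    f = [row[:] for row in y]
    for _ in range(iters):
        f = [[alpha * sum(s[i][j] * f[j][c] for j in range(n))
              + (1.0 - alpha) * y[i][c]
              for c in range(n_classes)] for i in range(n)]
    return [max(range(n_classes), key=lambda c: f[i][c]) for i in range(n)]

# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1

# Only one node per cluster is labeled.
predictions = lgc_propagate(adj, labels={0: 0, 5: 1}, n_classes=2)
print(predictions)
```

Because the labeled rows of F are not clamped to Y, the labeled nodes may in principle change class, which is the relaxation of zero empirical risk that the text attributes to LGC.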

artificial intelligence, dataset, machine learning, (17 more...)

Neural Information Processing Systems

Industry:

Health & Medicine (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

A Sufficient-Statistic Reduction of the Information Bottleneck to a Low-Dimensional Problem

Armstrong, Joss

arXiv.org Machine Learning, Apr-30-2026

We show that if the conditional distribution p(C | T) factors through a sufficient statistic ϕ(T), then the Information Bottleneck (IB) problem for (T, C) is exactly equivalent to the IB problem for (ϕ(T), C). The reduction is loss-free: it preserves the full IB curve, the Lagrangian optimum at every trade-off parameter β, and the optimal representations up to pullback through ϕ. As a result, the computational complexity of solving the IB problem is governed by the dimension of the sufficient statistic rather than the ambient dimension of the source. This identifies an exact structural condition under which the generic IB problem becomes tractable, and gives a formal bridge between the discrete and linear-Gaussian regimes. We then show that the classical Gaussian IB solution of Chechik, Globerson, Tishby and Weiss is an immediate corollary of this reduction, and we state a nonlinear-Gaussian generalisation. A small numerical example illustrates the practical consequence: when a low-dimensional sufficient statistic is available, the exact IB curve can be computed on the reduced problem at a cost determined by the statistic rather than by the ambient source dimension.
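One ingredient of this reduction can be checked numerically in a few lines: when p(C | T) depends on T only through ϕ(T), the relevant information is unchanged by passing to the statistic, so I(T; C) = I(ϕ(T); C). The sketch below verifies this on a small discrete example. The distributions and the map `phi` are invented for illustration, and the check covers only this information identity, not the full IB-curve equivalence claimed in the abstract.

```python
import math

def mutual_information(joint):
    """Mutual information in nats of a joint table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(p * math.log(p / (px[i] * py[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)

# Hypothetical source T with four values and a statistic phi(T) with two.
p_t = [0.1, 0.2, 0.3, 0.4]
phi = [0, 0, 1, 1]
# p(C | T) is defined through phi(T) only, so phi is sufficient for C.
p_c_given_phi = [[0.9, 0.1], [0.2, 0.8]]

# Joint distribution of (T, C): a 4 x 2 table.
joint_t_c = [[p_t[t] * p_c_given_phi[phi[t]][c] for c in range(2)]
             for t in range(4)]

# Joint distribution of (phi(T), C): merge the rows that share a phi value.
joint_phi_c = [[0.0, 0.0], [0.0, 0.0]]
for t in range(4):
    for c in range(2):
        joint_phi_c[phi[t]][c] += joint_t_c[t][c]

info_full = mutual_information(joint_t_c)
info_reduced = mutual_information(joint_phi_c)
print(info_full, info_reduced)
```

The reduced table has two rows instead of four, which is the sense in which the cost of working with the problem follows the statistic rather than the source alphabet.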

artificial intelligence, machine learning, theorem 6, (16 more...)

arXiv.org Machine Learning

2604.26744

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)

The optimal betting wealth growth rate

Ram, Ashwin, Ramdas, Aaditya

arXiv.org Machine Learning, Apr-29-2026

This paper characterizes the best possible rate of growth of wealth in a Kelly betting game when repeatedly betting against a general i.i.d. null hypothesis $\mathscr{P}$, but the data are drawn i.i.d. from an arbitrary alternative $Q$. We prove that it equals $\lim_{n \to \infty} n^{-1} \inf_{P \in (\mathscr{P}^n)^{\circ\circ}} \mathrm{KL}(Q^n, P)$, where $\mathscr{P}^n = \{P^n : P \in \mathscr{P}\}$ and $(\mathscr{P}^n)^{\circ\circ}$ is its bipolar, i.e., this rate is achievable and one cannot do better. This quantity is in general smaller than a more popular quantity in the literature, $\mathrm{KL}_{\inf}(Q,\mathscr{P}) := \inf_{P \in \mathscr{P}} \mathrm{KL}(Q,P)$. If $\mathrm{KL}_{\inf}(\cdot,\mathscr{P})$ is weakly lower semicontinuous (w.l.s.c.) at $Q$, we show that the two quantities are equal; in particular, this happens when $\mathscr{P}$ is weakly compact. For simple alternatives, we provide the first matching necessary and sufficient condition for when power-one sequential tests exist (without assumptions on $\mathscr{P}, Q$). We also derive the optimal worst-case growth rate against composite $\mathscr{Q}$. We emphasize that test supermartingales on reduced filtrations suffice for all i.i.d. testing problems, and more general e-processes are not required. We thus completely generalize the recent results of Larsson et al. (2025) to the sequential setting.
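The simplest special case of this statement can be checked directly: for a simple null P = Bernoulli(p) and a simple alternative Q = Bernoulli(q), the best expected log-wealth growth per round equals KL(Q, P). The sketch below uses bets that multiply wealth by 1 + λ(x - p), which has expectation one under P for any admissible λ. This toy case involves no composite null and no bipolar set, and the parameter values and function names are hypothetical.

```python
import math

def kl_bernoulli(q, p):
    """KL divergence KL(Bernoulli(q), Bernoulli(p)) in nats."""
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))

def growth_rate(lam, p, q):
    """Expected log-wealth increment under Q when each round multiplies
    wealth by 1 + lam * (x - p), a fair bet under the null Bernoulli(p)."""
    return q * math.log(1.0 + lam * (1.0 - p)) + (1.0 - q) * math.log(1.0 - lam * p)

p, q = 0.5, 0.7

# Log-optimal (Kelly) bet: the wealth factor becomes the likelihood ratio,
# q / p on x = 1 and (1 - q) / (1 - p) on x = 0.
lam_star = (q - p) / (p * (1.0 - p))

best = growth_rate(lam_star, p, q)
print("Kelly bet:", lam_star)
print("growth rate:", best, "KL(Q, P):", kl_bernoulli(q, p))
```

Any other bet size grows wealth more slowly in expectation because `growth_rate` is concave in `lam`. The paper's contribution concerns what replaces KL(Q, P) when the null is composite, which this sketch does not attempt.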

artificial intelligence, klinf, test supermartingale, (17 more...)

arXiv.org Machine Learning

2604.2528

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.45)

Training Neural Networks is NP-Hard in Fixed Dimension

Neural Information Processing Systems, Apr-28-2026, 22:36:26 GMT

… considering ReLU and linear threshold activation functions. Although the complexity of these problems has been studied numerous times in recent years, several questions are still open. We answer questions by Arora et al. [2018, ICLR] and Khalife and Basu [2022, IPCO] showing that both problems are NP-hard for two dimensions, which excludes any polynomial-time algorithm for constant dimension. We also answer a question by Froese et al. [2022, JAIR] proving W[1]-hardness for four ReLUs (or two linear threshold neurons) with zero training error. Finally, in the ReLU case, we show fixed-parameter tractability for the combined parameter number of dimensions and number of ReLUs if the network is assumed to compute … Our results settle the complexity status regarding these parameters …

artificial intelligence, machine learning, neural network, (17 more...)

Neural Information Processing Systems

Country: Europe > Germany (0.28)

Genre: Research Report > New Finding (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
